Fault-Tolerance Projects at Stanford CRC
Authors
Abstract
This paper describes the fault-tolerant computing research currently active at Stanford University’s Center for Reliable Computing. One focus is on tolerating hardware faults by means of software (software-implemented hardware fault tolerance). This work mainly targets faults caused by radiation-induced upsets. An experiment evaluating the techniques that we have developed is currently running on the ARGOS satellite. Another focus is on fault-tolerance techniques for adaptive computing systems implemented with field-programmable gate arrays (FPGAs).
Similar resources
Schrödinger's CRCs (Fast
I revisit the fault tolerance of cyclic redundancy checks (CRCs), expanding on the work of Driscoll et al. [1]. I introduce the concepts of Schrödinger-Hamming weight and Schrödinger-Hamming distance, and I argue that under a fault model in which stuck-at-one-half or slightly-out-of-spec faults dominate, current methods for computing the fault detection of CRCs may be over-optimistic. Keywords-c...
Error Detection by Diverse Data and Duplicated Instructions
Errors in computer systems can cause abnormal behavior and degrade data integrity and system availability. Fault avoidance techniques such as radiation hardening and shielding have been the major approaches to protecting the system from transient errors, but these techniques are expensive. Recently, unhardened Commercial Off-The-Shelf (COTS) components have been investigated for a low cost alte...
Symbolic Fault Injection
Computer systems that are dependable in the presence of faults are increasingly in demand. Among available fault tolerance mechanisms, software-implemented hardware fault tolerance (SIHFT) is constantly gaining in popularity, because of its cost efficiency and flexibility. Fault tolerance mechanisms are often validated using fault injection, comprising a variety of techniques for introducing fa...
Pseudorandom BIST: Theory, Simulation and Tester Data
We present experimental results from the Murphy and ELF35 test chip experiments to try to answer the following questions related to pseudo-random BIST: (1) How effective is the probability model relating test escape probability and test length of pseudo-random patterns? (2) How effective is the single stuck-at fault model used in predicting escape probability when pseudo-random patterns of a gi...
Dependable Adaptive Computing Systems: The Stanford CRC ROAR Project
We describe architectures and techniques for concurrent error detection, fault location, and recovery, used to design reconfigurable systems with high availability, data integrity, and protection from temporary, permanent, and common-mode failures. These systems can also be used for unmanned remote applications.